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Syntax directed editors (SDE) have been built to support popular languages 
or subsets of Lhose ‘anzuages. Typically, these implementations require large 
amounts of computing resources. This work describes the design and implementa- 
tion of a SDH which requires less than 58 thousand bytes of main memory and 
supports the full C programming language. Several extant SDE models are 
examined in an effor, to define a basic set of SDE facilities. Design principles 
are combined with machine constraints to produce a plan for the implementation 
of these facilities. A sample session with the resulting editor is provided. 

The syntactic irregularities of the C programming language are examined. A, 
discussion showing how language irregularities can hamper the implementation of 


a ODE follows. A grammatical definition of the C language is included. 
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1. INTRODUCTION 


Editors have existed in various forms since the advent of the computer itself, 
The first editors were very primitive, non-inberactive models used to manipulate 


unit records of card images. As computer usage, availability, apolicanons, and 
power increased, the facilities provided by editors also grew. From the primitive 
unit record editors came the batch editors which allowed the insertion of a single 
card within a batch to make global changes to all cards in that batch. Then came 
the interactive line editors which greatly increased the power of the ediling tool, 
but were still conceptually tied to the idea of an eighty colurnn card image. Truin- 
cation of input was a serious problem with these editors. Vanable length line 
editors arrived and partly solved the problem by employing the nobion of a cuper 
line [Ref. 1] commonly set to a maximum capacity of five hundred characters. [n 
such editors, the truncation problem was delayed, but nol solved. With the arriva! 
of stream editors, the truncation problem was finally solved and the link to the 
card image or unit record broken. These editors were still inconvemient models by 
modern standards. Full screen editors followed and becarne the basis for modern, 
inLeractive editing tools. These editors allowed any line on screen to be edited 
and, more importantly, provided for immediate screen odates of changes made to 
the file. 

The full screen editor proved to be very convemient for the user, Also, the 
functions performed by such editors became very powerful. Global search and rep— 
lace commands, block moves, file insertion, and similar features were cornmonly 


available. Generally, these editors required a second, follow-on program, called 


ey, 
‘ 


a fermabier to produce the desired oursut format ot the obiect being edited The 
combination of these two programs into one was the logical extension of the full 
screen editor. The inclusion of the formatter into the editor was an abbernpt to 
provide another feature designed to help the user. Such utilities came to be known 
as structure editors because they were able tc 'mpose some type of struchure or. 
the object as it was being edited. The user was able to see the final form cf his 
project while in the process of creating the projeci. 

The ideas involved in structure editors gave rise to the concept of the !an- 
puage or syntax directed editor (GDH). These editors are primarily intended to 
support the editing of programs as opposed to text. They have a built in 
‘knowledge’ cf the syntactic rules of some iznguage and can help the user by pra- 
viding assistance with the grammatica: conventions of that language. They alse 
have the abilities of structure editors in that they produce programs in accordance 
with standard, structured programming conventions, They generally include all the 
features commonly found in modern full screen editors as well. 

oyntax directed editors tnus allow a user to create and modify 4a prograra in 
terms of the syntactic nature of the supportea language. They are able to prevent. 
Inany lexical errors anc can, in efiect, themselvec produce much of the code 
needed by the average user. As the editor knows the syntax of the supported Jan 
guage, the user 1s normally freed from entering many of the basic syntactic units 
(the common elements of a FOR statement for example) found in most programs. 
Thus, fewer keystrokes and less time are required to create programs. Because 
of these features, the syntax directed editor can ensure that many simple errors 
are routinely avoided. Thus, it can be expected that programmer productivily 


will increase as syntax directed editors become available in greater numbers. 


A problem with most of the extant SDE systems is that many support only a 
carefully selected subset of some chosen language and they generally require reat 
amounts of computer resources for support. A logical goal then would be to pro- 
duce a syntax directed editer which fully supports some language and can he 
hosted on a minimally resourced machine. One method to achieve this goal is to 
first study the extant SDE implementations in an effort to determine some set of 
basic features. Then, select a commonly available language as the target 
language. Finally, attempt to design and implement a syntax directed editer on 
a microcompuher. This approach represents the main thrust of the effort for 
this thesis work, 

The first step is to examine existing SDE implementations. An effort will be 
made to extract a basic set of features common among the different models. The 
results of this examination will then be combined with a set of engineering 
principles. These two factors will provide guidelines for the design and implemen 


tation of a small scale or baseline model syntax directed editor. 


[AIS TING SDi oo ais 


There are many existing syntax directed editors. Most of them have been 
installed on computers supporting large numbers of users. Typically, the systerns 
are academic or experimental in nature. This has an impact on the design philoso 
phy behind each implementation. As an example, if the editor is intended to be 
used In an academic environment then the editor generally includes features which 
enforce the syntactic correctness of the program as it 15 being edited. ‘such 
editors may prevent the entry of any incorrect structure. Other models force the 
user to correct every error as it occurs, That is, if an error has occurred, no 
further entry of any type is allowed until that error 1s corrected. More lenient 
versions make note of any errors and display them so that the user’s attention 1s 
drawn to them. Reverse video is often used for this purpose. Erronecus constructs 
will be displayed in this manner until they are corrected. The overhead and 
Inconvenience that can be caused by systems that enforce correctness might not 
be viewed as worthwhile in a commercial environment. 

Another result of the nature of the current implementations is that they are 
nob very space efficient. Many use a tree structure to represent \he object 
being edited because of the natural correlation between trees and hierarchica! 
structures, such as programs. The tree structure itself and the necessary code 
to move about within the tree will consume a large armount of memory. There 
are alternatives to the tree representation which are not as convenient, but 


can result in memory savings. 


10 


A third characteristic of many SDE systems is that they are often designen 
to be a single component in an integrated programming environment. The editor 
may be designed to perform parsing and lexical analysis for the cornpiler. [ft may 
provide an immediate execution facility as an outgrowth of this architecture As 
an alternative, the editor may be directly linked to or contain an interpreter 
which supports an immediate execution facility. 

None of the characteristics discussed above are truly essential to a cyntax 
directed editor. They do represent very convenient features and are included in 
most of the SDE implementations which will be discussed. However, in an eifort 
to define the basic requirements of a SDE, the correctness issue, the internal 
data structure, and the immediate execution facility should be discounted. Ther 
‘correctness’ of a program can be very dependent on the compiler which must 
ultimately generate executable code. The underlying data structure used ody the 
editor should be transparent to the user. He should be totally unaware of internal 
representations of his program. Finally, immediate execution is a very pleasing 
feature. However, it is also very expensive to implement. In a basic SDE 
implementation, this expense is best left to a separate compiler. 

With these assumptions in mind, several extant SDE implementations “will be 
reviewed. Differences in the facilities they provide as well as their user 
interfaces will be noted. The models chosen for examination are representative cf 
several design philosophies and span nearly fifteen years of SDE construction. 
Emily, MENTOR, Interlisp, the Cornell Program Synthesizer, Z, SED, and 
COPE will be discussed. With the exception of Interlisp, the discussions will 


be presented according to the chronological order of each system’s inception. 
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A. EMILY 

Emily [Ref. 2] represents one of the initial attempts to implement a syntax 
directed editor. It was designed by W. J. Hansen at Stanford University and 
was completed by 1971. The system is not a part of an integrated environment but 
is intended to support any language which can be described in terms of a Sackus- 
Naur form (BNF) grammar. Emily accepts any BNF specification and silows the 
user to creale a program by selecting preductions from the specification. [n a 
sense, Emily uses a template method of program creation down to the lowest level 
of the supported language’s constructs, As an example, suppose that the grammar 


of Figure 1 represents a language supported by Bimily. Emily keeps track of the 





(Simi f=s aval OK pIe salon 
BEGIN <stmt*> ; END | 
WHILE <expression> <stmt*> : 


<simi'> (= <otmtpa 


| 
| <stmt> <stmt*> | 
<expression> ::= (<expression>) | : 
<expression> + <expression> | | 
| <var> 
| <var> = <identifier> 


Fig. 1 Hypothetical BNF Specification 
current node in the program development tree. If the current node was <stmli> then 
the user would be presented with a menu consisting of the three options for a 
<stmt> as listed in the grammar. He would select one of the opticns by pointing 


a light pen at the desired menu option. In this manner, an entire program would 
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be created Figure 2 shows the sequence of actions necessary to enter code under 
Ba] 


the Eimily system. In Figure 2, the current node is underlined and the menu is 


listed underneath the generated code. 
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Fig. 2 Programming By Selection 


_—— 





eo a 


Eimily insures that an incorrect program will never be entered. Every state- 
ment accepted has been derived from the grammar. As a result, there could 
be a very long sequence of derivations necessary to reach the lowest levels of 
the grammar. In the example of Figure 2, only two intermediate templates had to 
be selected in order to enter the identifier, However, this chain could be 
much longer and the resulting inconvenience much greater. 

In addition to the grammar specification, Emily requires a second table which 
Hansen called the concrete syntax [Ref. 2, p. 60]. This table was used to produce 
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the display characteristics of each grammar rule. AS an example, indentation 
rules were created from this table. The two tables together naturally lead to the 
internal representation of the edited object as an abstract syntax tree. 

Emily also supports holophrasting, a term defined by Hansen which inplies 
that the program can be viewed from different levels. The details of the program 
can be suppressed and the user may inspect his code from a higher level. The 
level of detail can be set by the user. Thus, at a high level, only function 
names might be displayed. Once the user determined which function he wanted to 
examine in detail, he could reset the level and ‘zoom in’ on that function. 

The tree representation has an impact on the methods a user must master in 
order to effect modifications to his program. The methods are very similar to 
those discussed in connection with the Cornell Program Synthesizer and will not 
be presented here. Another result of the abstract syntax tree representation is 
that the user must specify cursor movement commands in terms of the tree struc- 
ture. ‘These commands are very different from those found in conventional text 
editing systems. This is a trait shared by most SDE implementations which use 
a tree as the underlying data structure. The choice of using conventional 
text, cursor commands as opposed to tree rmovement commands is currently a 
subject of some debate [Ref. 3]. The issues involved are based primarily on 


opinions and will not be presented. 


B. MENTOR 
MENTOR [Ref. 4] is a syntax directed editor which supports the Pascal 
language. The system was described by Veronique Donzeau-Gouge at the Work- 


shop on Programming Languages held at Ridgefield) Ct. in June, 1980. 
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MENTOR is not part of an integrated system, nor does it support immediate 
execution of programs. A serious shortcoming of the system is that there is no 
interactive update of the screen to reflect the changes as they occur. Instead, 
a user must request screen refreshes as needed. This drawback is particularly 
inconvenient because of MENTOR'’s tree movement cursor commands. Again, ar. 
explicit request to refresh the screen must be issued in order to ascertain the 
current cursor position. 

MENTOR accepts input as text strings. Programs are thus entered in a 
conventional manner, a character at a time. As the text is entered, MIUNTOR 
parses if into an abstract syntax tree representation. When editing, a user 
views the program in its tree form. This is a shght inconsistency in that to 
enter a program, text commands are used, but to edit the same programm, tree 
cursor movement commands must be used. 

MENTOR utilizes MENTOL, a general tree manipulation language, to ef- 
fect tree movement commands during editing. The MBNTOL commands constitute 
a language unto themselves and require some learning effort on the part of the 
user, [ome of the MENTOL commands can have a very strange appearance to the 
uninitiated) As an example, "OTXT F @ if Svi then x:=$v2 else $v3" will lock 
in a subtree denoted by the marker "@TXT" for the next if statement containing 
an assignment to the variable x in its then part. 

The MENTOR implementors based their choice of a non-interactive screen 
on portability issues. The system uses teletype compatible output so that it can 
be transported to almost any machine. Their decision to utilize a tree structure 
while editing and a text structure for creation is based on their decision not, 


to maintain two internal representations of the program {1e. one as a tree and 
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one as text) Rather, the program is unparsed and prelty--printed on the screen 
as needed. Holophrasting is available in MENTOR. Many design decisions 
reflect the engineering tradeoff of sacrificing user convenience to other 


considerations. 


C. INTERLISP 

Interlisp [Ref. 5] is a growing and constantly changing system 
that has been built over a period of years by several key people. The system 
is so well integrated that it is difficult to discuss only the features of the 
editor, Thus, a brief overview of the major facilities available in the 
Interlisp system is presented, 

The Programmer’s Assistant (PA) is the editing facility in Interlisp. 
One of the PA’s most distinctive features is that it maintains a history list, 
of the user’s input, a description of the side effects of operations, and the 
results of operations. This is a powerful mechanism which allows use of the 
REDO and UNDO commands. REDO allows a user to repeat a particular 
operation of serles of commands. UNDO negates changes and can restore 
a file to an earler state. UNDO can also be used to execute different 
versions of a file. 

Because of the highly integrated nature of Interlisp, a user of the PA 
also has access to the Do—What-I-Mean (DWIM) and Masterscope facilities. 
DWIM is a very powerful facility which, in the case of incorrect input, can 
make a good guess as to the user’s actual intent and attempt to execute the 
intended operation. Correction of spelling errors is a good example of a 


DWIM capability. Masterscope is a facility which can cross-reference user 
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programs to determine where variables are declared, functions are called, and 
similar actions occur. It is intended to assist the user in locating side effects 
created by changes to low level utilities in a large program or set of programs. 
It maintains a database of its findings for each user and can respond to such 
queries as "Who uses FOO freely", 

The structure of Lisp lends itself to the tree representation used by 
Interlisp. Programs are entered sequentially as characters but are edited as 
tree structures. This difference is not important in Lisp because of the very 
simple syntax oi the language. 

Many features in modern SDE systems were first implemented in Interlisp. 
Lisp, by nature of its structure, lends itself to many of the facilities found 
in Interlisp. These facilities do not often carry over easily to block 
structured languages. However, Interlisp is an important system thal generally 


precedes block structured editors in terms of facilities provided. 


D. CORNELL PROGRAM SYNTHESIZER 
The Cornell Program Synthesizer (CPS) [Ref. 6] is an integrated 
environment which was available as early as 1981 and is generally attributed 
to Tim Teitelbaum of Cornell, A SDE is the heart of the system. Originally, CFS 
supported PL/CS, a subset of PL/I. This SDE is a table driven model however, 
and later versions of CPS support Pascal as well | Ref. 1]. 
CPS features a hybrid type editor. Programs are represented internally as 
- tree structures. Program creation is accomplished both by direct entry and by 
selection. Templates are used in manner very similar to the Emily implementation. 


However, these templates are available only for the higher level constructs of 


a 


the supperted language. Low level components, such as identifiers and 
expressions are entered directly. Correspondingly, cursor movement is performed 
differently at the two levels. Tree movement is the rule when editing high 
level components. Character by character movement is performed at the low levels. 

Hditing programs under CPS is also accomplished at two levels. A user may 
not directly edit any template. These may only be deleted or inserted. In fact, 
the cursor 1s never allowed to rest directly on a template when in the edit, 
mode. Low level expressions, however, may be freely edited. These are parsed 
on input and placed directly into a tree structure. Modifications under this 
scheme can be more cumbersome than when a conventional text editer is being 
used. As aa example, suppose that a program contains the PL/I construct of 


Figure 3. If a user needs to place this construct within a while loop, a set 


SSS SSS PS SES] 
: IF (INDEX = LIMIT) | 


: THEN PUT SKIP LIST(LIMIT REACHED’ 


Fig. 3 PL/I Program Construct 





procedure similar to the modification methods used under Emily et be followed. 
First, the IF template must be 'clipped’ {a temporary deletion) from the program 
to be stored in a temporary buffer. Then, a WHILE template would be inserted 
at the original position of the IF statement. The cursor is then moved to the 
statement placeholder within the WHILE template. At this point, the IP 
statement is reinserted into the program. The procedure in this simple example 
is not complicated. However, if many lines of code must be moved, the technique 
can be inconvenient. To relocate code within an existing loop is more difficult. 
When using a conventional editor, one would simply move the ’HND’ statement. 
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In CPS, the ‘BEGIN’ and ’END’ are part of the same template, so neither 
can be moved independantly. Thus, the code to be included would have to be 
clipped, the ’BEGIN’, ’END’ pair would have to be expanded by inserting 4 
statement template, and then the code could be inserted. This procedure is more 
inconvenient than that which would be required by a conventional text editor. 
The tradeoff is that CPS attempts to insure that any program being edited is 
always maintained syntactically correct. In the event that correctness can not 
be insured, constructs in error are highlighted in reverse video. 

A major contribution of CPS is that it was one of the first implemen- 
tations of a SDE for a block structured language to be a part of a total 
environment. The system features an incremental compilation scheme which ailows 
Immediate execution of programs being edited. CPS also improved user convenience 
as opposed to earlier implementations. A drawback to the system 1s that dual 
modes are required to accomplish modifications. CPS has not been able to match 
the editing convenience available in conventional text editors. 

Another less than desirable feature of CPS concerns the commenting 
conventions. CPS provides a comment template which contains a statement 
template as an integral part. Thus, each comment is related to a statement, 
or block of statements, and the comment appears in a fixed position above the 
Statement it refers to. This convention does lend regularity to the appearance 
of comments. However, the user Is restricted in that he may not place a comment 
on the same line with a program statement. The convention was necessary to deal 
with the arbitrary nature of comment placement. An editor is not allowed the 


compiler’s luxury of simply throwing comments away. They must be saved and 
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recognized in their own right. Further, if comments are allowed in a totally 


arbitrary fashion, it is not possible to produce a comment template easily. 


Z is a structure editor developed at Yale University which can operate 
on program structures, given a grammatical description of the target language. 
It has been called the "95% program editor" [Ref. 7] because it can accomplish 
approximately 95% of the functions that the implementors felt a SDE should be 
able to provide. Z is capable of editing any hierarchically structured object, 
given a description of that object. It has been used to support several program- 
ming languages including Lisp, APL, Pascal, and Bliss. 

Z treats the objects it edits as text. A tree representation is not used in 
any fashion. Thus, all textual cursor movement commands are valid and every 
component of a created object is inserted sequentially, character by character. 
Z imposes structure (indentation for program objects} on its objects by means of 
a predefined table of information. This is similar to Kmily’s concrete syntax 
table. As an example, Z’s table might contain information pertaining to the 
recognition of a Begin — Kind block which would allow the editor to insure the 
correct indentation of the statements within that block. Z also ‘knows’ how to 
insure balanced parenthesis, quotation marks, and similar context tree constructs. 
The editor is also able to support holophrasting by using the indentation level 
of each statement. Z is not part of an integrated environment. [It is linked to 
a compiler in that the editor may be temporarily suspended while a compilation 
is performed. Any error messages generated by the compiler are recognized by 


Z when it regains control. A compiler is not integrated directly into 4 because 
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the tmplementors felt that "the prograrnmer is tne person best able to decide 
when his program is in a state ready for compilation" and further, that 
“existing compilers are perfectly able to locate errors" [Ref. 7, p. 5]. 

Thus, Z 1s a very convenient system to use. It features straight-forward 
textual cursor movement commands at every level and can help the user to ensure 
that simple things are done correctly. [t does not attempt to parse user Input. 
Therefore, program correctness can not be ensured. !L is a full screen editor 
and does not suffer the limitations of the MENTOR teletype approach. Z is also 
available as a general document editor. It can perform Spelling checking and 
general word processing tasks when used in the document mode. Not all of these 


features are available in the program editing mode. 


PL SED 

SED {Syntax Editor) is a SDE implementation described by Lloyd Allison 
of the University of Western Australia in 1983 [Ref. 8]. The system supports 
the Pascal language and is hosted on a PDP 11/60 computer. SHD is not part. 
of an integrated environment and is similar to the 4 system in the design 
philosophy governing the facilities that a SDE should include. 

A central idea of the SED system is that "the editor should edit and that. 
pretty—printing, type checking, and so on should be performed by optional 
commands or other programs" [Ref. 8, p. 454]. SED does not force any absolute 
standards on users. Even pretty—printing is an option which, when used, provides 
Suggestions, not absolute standards. Similarly, SEHD does not attempt te 
ensure program correctness. An effort-is made to parse user input "as well 


as possible". Allison has distinguished “long distance" and “short distance" 
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syntax checking. Shori distance refers to the area immediately surrounding a 
single stutement. A long distance situation would occur, as an example, when 
a variable reference is made somewhere in a program and the declaration for 
that variable has been deleted. SHD makes no attempt to correct or even 
identify long distance syntax errors, 

This decision exemplifies SED’s overall design philosophy. The user is 
held responsible fer the ultimate correctness of his program. Problems such 
as the dangling else are ignored because of the idea that a compiler will 
locate such errors. A philosophy very similar to that of Z 1s apparent, 
throughout the SED design. 

“HD does attempt to parse user input. As a result, an abstract syntax 
tree 1S used as an internal data structure. Templates are not available and 
all input is sequential SED has chosen this implementation in an effort, 
to reduce the storage requirements of the tree strucbures. Mach individual 
token is nob a separate node in the tree. Rather, strings containing pointers 
to substrings are used. The tree is collapsed to its textual form for off 
line storage. This requires the tree to be constructed each time an objech is 
read in to be edited. SHD also uses the tree to support cursor movement. 
commands, Textual movernent commands are generally not available. 

Thus, SHD is a less ambitious approach to the implementation cf a 
syntax directed editor. It does offer yvreater functionality than Z in that, 
OHD attempts to ensure a degree of program correctness, However, the ultimate 


arbiter of correctness is a Separate compiler. 
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ey COPE 
COPE (a Cooperative Programming Environment} is a newly completed system 
devised at Cornell University under the directicn of Richard Conway [Ref ©]. 
It is intended to offer an alhernate system to the CFS environment. COP!) ic an 
integraLed environment and is capable of immediate execution, a ferm ot 
reverse execution, and similar features. 
The COPE system dees much more than ensure program correctnesc. It 
incorrect input is received, the system ahtempts to guess the user’s intent and 


generates syntactically correct code based on that guess. It does not use 4 


f require strict character by character program 
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template system, nor does 
entry. Because the system is designed to guess a user’s intention, only a 
few tokens which correspond in some way to a program construct need be entered. 
The system will then supply correct code which corresponds most clonely to 
the fragmented input received. Whether or not this approach goes too far i: 
debatable in its own right and is beyond the scope of this discussion. 

COPE utilizes a tree struclure to represent programs. Thus, tree movement 
cursor commands are the rule. However, unlike CPS, program keywords may he 
directly edited. Any program element is subject to modification in this 
system. COPH does have a drawback regarding its commenting conventions. 
Comments are allowed only at the beginning of a block. The implementors felt. 
that good code will be self documenting and that line by line comments would 
be superfluous. 

COPE represents one extreme view of a syntax directed programming 
environment, It not only assists a user in creating correct code, but will 


achually try to generate such code for him. Conway stated that users tended 


Lo try to “goad” the system into creating desired code sequences rather than 
attempting to completely specify their own code with the syshem’s assistance. 
Thus, the design philosophy behind COPE represents the extreme end 


of the spectrum when compared to systems such as Z and SED. 


H. CONCLUSIONS 

Very divergent design philosophies are evident in the systems examined. 
The earhest systems such as Emily required a user to go to extremes when 
entering a program, but ensured that programs were always syntactically correct, 
Later systems seem to have focused more on user convenience while trying to 
provide an atmosphere stimulating the production of correct prograras., ‘Total 
program correctness is not always ensured. The COPE system takes the view 
Lhat ease of entry and syntactic correctness are equally important goals. ‘Thus, 
correct programs can be creahed from very scant user input. The systems share 
very little common ground in major design decisions. One conclusion then 1s 
that the production of syntax directed editors is still a very experimental 


subject. The chart in Figure 4 depicts the variances among the difterent 


There are a few general properties of syntax directed editors which can be 
inferred. The editor should be convenient. [t should assist the user as imuch as 
possible in creating correct and readable programs. It should not be 
significantly more difficult to use than a conventional text editor There 
must be a balance between the facilities provided and their ease of use. The 
editor should allow an abbreviated means of text entry for the common elements: 


of the supported language. Pretty—printing and holophrasting should be available 
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when desired. The SDE should be able to provide these features common in 


conventional! text editors. 





SSS SS eS Ee 

oysbem User Program = Syntax Long Immediate Holo- Comment 
View Entry Enforced Range fixecution phrasting Convention 

Checks 

Rmiuly Tree Selection Yes Yes No Yes Restricted 

MENTOR Tree sequential No No No Yes Resbtricbed 

Interlisp Tree sequential Yes Yes Interpreted Yes Pree 

el} Tree Both Yes Yes Yes Yes Reshriched 

L Text Sequential No No No Yes ree 

olD Tree Sequential Yes No No Yes 

COPE Tree Fragments Yes Yes Yes Yes Restriched 





Mig 4. Facilities Comparison 
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The implementation of these features will require engineering Lradeot{s. 
A. baseline syntax directed editor need not be as ambitious as COPE, however 


\ = hii 
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to be a true SDE, it should not be a "95% program editor’ 
tradeoffs necessary in a baseline implementation of these feahures will be 


more thoroughly discussed in the following section. 


lil MAJOR DESIGN DECI. 


A pararnount concern in the design of the system ts that ib rust be 
able to reside in the small memory common in most muicrecomputers. This 
concern must also be weighed against time constraints. An Interachive sycteri 
which provides poor response times will not be tolerated, particularly when 
the host machine 1s a dedicated resource. 

In most cases, time and space are contradictory resources; one can usually 
be traded for the other. The design of this systern must achieve a balance 
bebween the two. The test machine for the editor's implementation has only 
98,000 bytes of memory available for transient programs and 14 uses a 
relatively slow 2MHz system clock. Because of these system limitations, time 
and space constraints will impact on every major design decision. They will 
be implicitly included in the statement of every other design principle. 

There 1s a wealth of information available in the literabure regarding 
properties thai contribute to the design of a good system.. Two sources 
have influenced the design effort for this project. W. J. Hansen presented 
his "User Engineering Principles" [Ref. 2] used in the design of Emily. 
These are listed in Figure 5, Studies have been conducted af Carnegie - Mellon 
University (CMU) in an effort to determine a set of properties which 
constitute a "graceful" inberactive system interface [Ref. 10]. The results of 
their studies are presented in Pere 6. The basic maxim which can he 
extracted from these sources is that ease of use is a primary concern. The 
interface for this system assumes a user with little or no knowledge of 
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Know the user 

fiducation, Experience, and Patience 
Minimize memorigation 

oelection, not entry 

Names, not numbers 

Ensure predictable system behavior 
Optimize Operations 

Rapid execution of common operations 

Display inertia 

Recognize command parameters 
Hingineer for errors 

Provide good error messages 


ingineer out the common errors 


Fig. 5 Hansen’s Design Principles 
SSS SS SSS SS eS 
Flexible parsing -- Correct small, simple mistakes in the interface language 


Robust communications —— Avoid verbosity but let the user know when the system 
understands 


Identification from description -- The system can identify internal objects casily 
Focus tracking -—- The system tracks the user’s focus of attention 
Hxplanation facility -- The system can explain what it can do 


Personalization -- The system can adjust to the user’s desires 
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prograraming systems. As a result, the interface for the editor will be menu 
driven as much as possible. Menu selections will be designated by a sinjle 
letter {possibly accompanied by the control or shilt key) which should be 
innemonically related to the desired system action. Hansen’s “use selection, 
not entry" principle will be stressed. 

Another major principle can be inferred from beth sources, the principle 
of regularity. To ensure that a system is predictable it must be regular in 
nature. A system response should be regular across the different pocuble 
states of the system. An implication of regularity is that the syslem should 
adopt an “all or nothing" approach in the facilities that iL provides A 
particular design decision that rests on the regularity princitle concerns 
the correctness issue, An approach similar to that of SEHD should be avoided. 
Parsing the input "as well as possible" 1s not a good design decision because 
the user will nol know when he can and can not depend on editor decisions. 
Thus the time and space spent on partial parsing 1s wasted in many situations. 
Hither the CPS fall} or Z (nothing) approach should be taken. On the 
assumption that the editor will be pressed to fit into a small space, proyraim 
correchness and, thus, parsing will be left for the compiler. 

Another important design principle is that the syshem should do as much as 
iL can for the user. This principle interacts with both of those presented 
above. Biase of use implies that the user should be freed from mundane, 
rudimentary tasks which the system can perform. He should be given prompt: 
whenever possible and should have simple things done for him. The repularity 
principle linplies that, while the system should do all bhat it can, it should 


nol presume too much. If the system is designed to make too many guesses as to 


the users intent, ib could offen guess incorrectly, This would constitute a 
source of annoyance and would waste the time and space needed to penerate the 
guesses. Jhus, the system should try to do all that if can, but ib should 
only attempt those tasks which it can do correctly In most instances. 

When considering these things that can always be done correctly, attention 
becomes focused on thoce things that are integral to the targel langucajee, 
Such facilities acs balancing parenthesis and quotation marks should be included. 
Further, one knows that the keywords and major constructs of the language must. 
be used in every program. Thus, a natural tendency is to implement scme torrn 
of the template method of operations. Since the keywords and stahic consbruchs 
of the language are always the sare, a template can be used to present heir 
structure in every instance. Further, the use of templates will provide 
prompts for the user and will allow him to enter many characters with a single 
keystroke. The use of templates will correspond to another design principle 
as well, 

The user should not be required to provide the same informalion more than 
one time. Once the system has been given a correct piece of information, the 
user should be able to ‘call up’ that same information if it is needed ayain. 
Template usage satisfies this principle. The systern knows the correct Lorrnat. 
and construction of every statement and the user is able to employ this: 
knowledge directly. Another facility which will support this principle is a 
macro substitution feature. Once a user has provided some information to the 
system he should be able to store it for later reuse. If the system were .to 
save evérything automatically, a great deal of space would be wasted. However, 


if the user knows that he requires repetitive use of a code sequence, he will 
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be able to save if. Only the user will be able to predict his future require 
ments. A good macro substitution facility can help to “engineer out error’ 
by allowing correct code sequences to be used many times. Further, only a small 
amount of memory will be required by this facility. 

The system should also be amenable to user desires. If, should not dictate 
user actions. The same freedoms that are available in the conventional text 
editors should be included in the syntax directed editors as well. This 
principle has many implications. A full screen editor is a necessity. The 
telebype approach of MENTOR should be strictly avoided. The editor must 
provide convenient and powerful text entry commands. The derivation approach 
of Fimily should be discarded. ‘Since arbitrary decisions reflecting the 
preferences of the designer will not be allowed, the restrictive commenting, 
conventions of CPs and COPE are not appropriate, Because inca 
ODE will not attempl parsing or imtnediate execulion, forcing comments to be 
placed in any specific style is not justifiable, regardless of time and 
Space constraints. 

The above discussion presents the primary design principles for this 
effort. They are a blend of those presented by other system designers and 
Lhe time and space restrictions Inherent to this project. Figure 7 contains 
a condensed and integrated listing of the major principles. 

Adherence to these principles has provided many of the major design 
decisions. Composing the time and space constraints with these decisions will 
yield new systern traits. Because space is a limited resource and- user program 
correctness Is not a goal, internal representation of the user proyram as an 


abstract syntax tree would be wasteful. This imphes thal text oriented 
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commands will be used Lo move through the programs, The regularity principle. 
and the textual movement format imply that objects created by templates should 
be subject to editing operations. This situation is possible in a system that 


does not offer immediate execution or ensure correctness. Allowing templates 
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a Fe Ne 
Time and space constraints influence every decision 

Hase of learning and using the system is important 

The system must be regular in the execution of its facilities 


The system should include only those facilities that can be done well, not 
situation dependent features 


The user should not be required to restate information 


The system should be amenable to the desires of any user 


Fig. 7 Major Design Goals 


to be edited will also eliminate the need for the cumbersome modification 
methods of Emily and CPS. A drawback is that a user could ‘fool’ the editor. 


epecitically, i the editor knows that a statement is a particular hype of 
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template and the user changes the statement, then the editor will not be able 
to identify the new type of the statement. An integral parser would solve thr 
problem but would also consume space. 

Recause any construct will be subject to editing and the user should not. 
be bound to editor decisions, an alternative to the template mode of program 
creation should be available. Further, the editor may not be able to recognize 
the type of edited objects. Thus a free input mode is a necessity. vouch a 


mode will allow the user to circumvent any editor decision and will enhance 


ease of program modification. A drawback to this mode of operation is that 
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the editor will not be able to assist the user in any way. It should be noted 
that such a facility would not be practical in template systerns which attempted 
fo ensure program correctness. 

Time and space constraints affect one other major design decision, honld 
the editor be table driven so that many languages can be supported or should 
the editor be customized to recognize only one language? Clearly, if a table 
driven model is used, effort will be required to load and interpret the rules 
of the target language. Again, relying on the assumption that space is a 
paramount concern, the knowledge of the supported language should be built, 
into the editor. If a good modular design is used, there will be a pogssitulity 
of supporting many languages by altering only the language specific modules 
of the editor. Parnas’ information hiding and encapsulation principles can 
be used in this effort [Ref. i] The design of the editor should include 
a separation of functions. This will ensure an efficient use of available 
space while not totally negating the possibility of mulliple language 
support, 

Given these decisions, some language must be selected for the test 
iriplernentation. The language should be commonly available for microcomputers. 
Also, the regularity principle imphes that the full language should be sup- 
ported, not a carefully selected subset as is the case with many oDE 
Implementations (CPS and COPE as examples). While support for the full 
language is desirable, any implementation of a language which includes many 
additional features should be avoided. The time and space constraints already 
on the system are restrictive. Supporting a superset of a language could 


only add to the burden. 
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weveral languages were examined in light of these considerations, ‘The 


© programming language [Ref. 12] was selected because if adheres to the 
constraints in almost every implementation. After the detailed desyrn phase 
of the project began, iL became obvious that a hetter choice could have 
been made. Although the language is small and a superset was not involved, 
C is so ill defined that its irregularity hampered the SDK implementation. 
There does not seem to be a standard grammatical description of the language. 
Other faults were discovered. The next section of this thesis contains a. 
brief accounting of some of the language dependent problems that were 
encountered, 

Adherence to the design goals and observance of system constraints 
allowed many design decisions to be made at an early point in time. A concise 


statement of these decisions is provided in Figure 8. 
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SS eee 
A full screen editing model will be used 
The system will not be table driven 
System commands will be single key, menu driven selections 
Prompts will be provided as often as possible 
Internal storage will be based on a text format 
Textual cursor movement commands will be used 
There will be no assurance of program correctness 
There will be no parsing mechanism 


The template mode of program entry will be used and there will be an option 
to override this feature 


Template structures may be edited 

There will be no restrictions on comments 

A macro substitubion facility wall be available 

The system will generate punctuation and low level comments sutomatically 
The C programming language will be supported by built in knowledge 


Fig. 8 Karly Design Decisions 


te at ot ee 


o4 


mm OC 


C [Ref. 12] is a programming language developed at Bell Laboratories 
by Brian Kernighan and Dennis Ritchie. The language is rapidly growing in 
popularity and many systems have been implemented in C, UNIX being a 
prime example. One reason that C has been used for major applications is its 
power, power in the sense that assembly language is powerful. There is nothing 
that could be done in assembly language that could not also be done in C. In a 
major sense, C is simply a portable assembly language which includes some of the 
convenience features commonly found in modern high level languages. However, 
because C is so closely linked to assembly language, it contains many of the 
pitfalls awaiting programmers at that level. 

C contains many linguistic traps. Feature interaction and the poorly 
defined syntax can cause many errors to go unnoticed. These traits caused 
several problems in the implementation of the syntax directed editor. In order 
to examine some of the unusual design properties of the language, © will be 
analyzed in terms of good principles of program language design. Bruce J. 
MacLennan has defined sixteen such principles [Ref. 13] which are presented 
in Figure 9. Some of these principles conflict with each other and MacLennan 
has admitted that no language could satisfy each of them because of these 
interaction conflicts. However, one would be hard pressed to find a single 
language which violated more of the principles than does C. In fact, 
Kernighan and Ritchie intentionally violated some principles in an effort to 
add assembly level power to the language. 
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Abstraction -— Avoid requiring something to be stated more than once, factor 
out recurring patberns 


Automation -- Automate mechanical, tedious, error prone activities 


Defense in Depth —- lf an error penetrates one line of defense, it should be 
caught by another 


Information Hiding -- Ensure that a user has all the information needed to use 
a module and no more and the implementor has the necessary information 
to implement a2 module and no more 


Labeling -- Do not require the user to know absolute positions of objects 
Localized Cost -- A user should only pay for what he uses 
Manifest Interface -— All interfaces should be apparent 


Orthogonality —-- Independent functions should be controlled by independent 


mechanisms 
Portability —- Avoid features that are dependent on a particular machine 
Preservation of Information -- The language should allow the user to present 


information he knows and the compiler may need 


Regularity -— Regular rules, without exception, are easier to learn, use, 
describe, and implement 


oecurity -~— No program that violates the definition of the language o1 ibs 
intended structure should escape detection 


olmplicity -- A language should be as simple as possible 


otructure -— The static structure of a program should correspond in a simple 
way to the dynamic structure of computations 


oyntactic Consistency ~-— Things which look similar should be similar, things 
which look different should be different 


Zero, One, Infinity -- The only reasonable numbers are zero, one and infinity 


Fig. 9 Maclennan’s Principles of Language Design 
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First, consider the © grammatical specification contained in Appendix A, 
Lo Reference 12. These syntax charts are preceded by “This summary of C syntax 
is intended more for aiding comprehension than as an exact statement of the 
language" [Ref. 12, p.214]. This sentence is a model of understaternent. ‘The 
syntax 1S presented in a precise notation similar to BNF. However, the pre 
cision of this notation is illusionary. Often a single rule having no mere 
than five components will be accompanied by two pages of text which explain 
the various exceptions to the BNF description as presented. In such cases, 
the conceptual clarity offered by the notation is lost. The primary example 
of this situation concerns the rules for variable and type declarations, ‘There 
are only two basic types in the language. However, seventeen BN" rules and 
over sixteen pages of text {in summary form) are devoted to explanations of 
various declarations involving the two types. 

As an example, consider the legal and illegal declarations contained in 
Figures 11 and 12 respectively. Note that, once the preceding type has been 
added, the objects from both figures are easily derivable from the C declaration 
rule contained in Figure 10. The C method of specification for legal declaration: 
violates the regularity, security, and simplicity principles and is a. ross 
violation of the syntactic consistency principle. This ill defined nature al 
legal declaration forms was a major obstacle in the construction of a template 
for declarations. In this case, abiding by the principle of only doing those 
things which could always be done correctly, a very free form template was 
provided. The user receives the very terse prompt "<type><identifier list>" 
and is free to enter any structure deemed necessary. Even this very basic 


template is not sufficiently vague to avoid user inconvenience in all cases. If 
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decliratonm= 


identifier | declarator(} | 
(declarator) | *declarator | 


declarator[constant—-expression {optional}| 


ee ee ee ee 


Big. 10 Grammar Rules for Declarators 





= ee 
See i i 
rm ee eee 


F is declared to be a function returning 


int F() an integer 

char. ba a pointer to a character 

union **F() a pointer to a pointer to a union 
int (*{)) {| a pointer to an array of integers 
shiucte ( HL el @ pointer to a function returning a 


pointer Lo a structure 


“Fig, Lt Legal C Declarations 





G is declared to be a function returning 


char (G(Q))]| an array of characters 


int (FGQIO an array of pointers to functions 
rebLurning integers 


char (*GQ)Q}] a pointer to a function returning 
an array of characters 


struc ei a pointer to a function returning a structure 





"Fig. 12 Illegal C Declarations 


a shructure object whose members are initialized is declared, this template 
can be cumbersome. An example of the procedures necessary to effect this type 
of declaration is provided in the next chapher. 

Another rajor concept in C involves the notion of “Ivalues", An “object” 


is Uf 


a manipulatable region of storage" [Ref. 12, p. 183] and an Ivalue is "an 
expression referring to an object" [Ref. 12, p83] The concept of Ivalues is 
very basic to the language. Only lvalues may appear on the left side of an 
assignment statement. Because of the way in which C uses pointers and implicit 
Lype coercions, it is often not obvious when something can be considered to be 
an lvalue. The basic example of an Ivalue is an identifier name. However, other 
ie 


lvalues can be easily constructed, As an example, ‘a’ is a characber constant 


meee anced. Inen “(a + “abed’) 1s a legal lvalue and "*{’a’ + ’abced’) = 2:" 
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is a legal statement. In this case, ’a’ is converted to its integer represen: 
tation. The characters ‘abed’ are also converted to a single integer althoush 
the method of conversion varies from machine to machine. The "+" adds two 
integers. Finally, the "*" tells the compiler to refer to the contents of the 
following address. Thus, the addition of two charachers iraplicitl y coerced into 
integers is explicitly coerced into an address and becomes a legal Ivalue. ‘such 
constructions violate many of Maclennan’s principles including the securihy, 
portability, regularity, defense in depth, simplicity, and the = sbructure 
principles. Further, the irregularity of such a basic concept as which con- 
structs may appear on the left of the assignment operator causes many problems 
for the SDE, particularly one which does not include a parser. 


C is very amenable to the coercion of variables. There is a_ strict 


(although somewhat irregular) hierarchy defining the coercion rules. In the 
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event that automatic coercion is insufficient, a specific operator, the cast, 
is provided, This allows a user to freely violate the security and defense in 
depth principles. However, the cast does provide a warning of coercion and i: 
thus supportive of the structure principle. 

Another major flaw in the language is that no run time diagnostics are 
provided. The compilers will be very content to generate code which effects 
an out of bounds array assignment. These assignments can, in fact, store 
values into locations where executable code should reside. Thus, mysterious 
errors can occur and there will be no clue as to their origin. Such achions 
violate many principles, including the automation principle. Initially, an 
attempt was made to enable the editor to generate in line code which would 
provide notice of out of bounds assignments. However, this task proved to be 
extremely difficult without a knowledge of the dynamic state of the program 
in execution. Notice that in Figure 10, the bounds description in an array 
declaration may be omitted in some cases. Generally, omission is allowed 
in the first index position or when the array is explicitly initialized. An 
array declared as a formal parameter which has no bounds declaration inherits 
the bounds of the actual parameter. The editor would have no way of determining 
these bounds from the static program. Thus, the feature could not be applied in 
all cases and was omitted. 

- Arrays themselves can cause much confusion. When an array 1s declared, the 
cardinality of its possible members must be used to define the array bounds. 
That is, “char arrayname[9]" declares an array of 9 characters. Flowever, a 
reference to arrayname[9] is an out of bounds reference. Array subscripts start 


at QO while size declarations start at 1. Further, the major use of arrays in C 
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is to hold character strings. Thus, if an array was designed to hold the string 
"cat", one would assume that "char arrayname[3]" would be a proper declaration. 
This 1s not the case. The compilers recognize strings and automatically insert. 
a 0 into the position following the last character in the string. Thus, the 
declaration must account for one additional place and "char arrayname[4]" would 
be the proper declaration in this example. Referring to arrayname[3] would yield 
the 0 byte and arrayname{2| the letter ’t’. This siluation violates many 


principles and again frustrated SDH attempts to provide some protection for the 


user, 

The operators which designate assignment and equality tests represent 
another poor choice made by the C implementors. The ’=’ is the assignment 
operator while ‘==’ 1s used to designate equality tests. A natural act for a 


user constructing a program 1s to type what he thinks. Thus, iL would be very 
easy to enter “while(a = bj..." which would cause an infinite loop as long as 
the value of b was nonzero. The operator choice in this case violates the 
orthogonality and syntactic consistency principles. The editor was able toa 
provide some assistance in this case. Because of the frequency with which the 
error occurs, a specific check for assignments when a conditional exprescion 
was expected was Included. 

In general, many of the operator conventions implemented in © represent, 
violahlons of the structure, regularity, and syntachic consistency principles. 
One example concerns the cascading of operators within a single statement. 
When code such as "a=b=c=0" is used, the cascading feature is convenient, both 
for the user and the compiler, However, the cascading feature also applies to 


relational operators. A legal statement is "a<b<c". This piece of code would 
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be very convenient if Its meaning corresponded to its appearance. This cade will 
first test a and b, If a is less than b, the first part of the statement will 
evaluate to 1, 0 otherwise. Then, the value of c will be tested ayainst 1 or 0. 
A further example concerns the overloading of operators, as exemplified by the 
’® operator. Dependent on context, a single ’#’ can mean “the address of" or 
"bitwise logical and", Also, ’## means “logical and" and "x d#= 1" means 
"x = x & |", a bitwise logical and operation. 

The comma operator represents the worst C operator overloading convention. A 
single comma may have three meanings. First, 1b separates parameters in function 
calls. Secondly, if separates different expressions within a single staternent. 
Finally, if is used as a value separator in an inihialization list. The corabina- 


tion of these meanings can cause problems. The statements “call{x.y}" and 
"call((x,y)}" are totally different. The first is a funchion call with two 
arguments, the second call has only one parameter. The use of the comma tn the 
second statement supposes that x has some side effect which 1s complete In 
itself. Once evaluated, the value of x Is discarded and the value of y 1s used 
as the actual parameter. The comma convention has implications for compilers 
which attempt to build syntax trees immediately, eliminating parenthesis as 
they are encountered. Such compilers would evaluate both of the above stale- 
ments as function calls with two arguments [Ref. 14]. 

Finally, the ‘++’ and '-~’ operators deserve some mention. These are the 
increment and decrement operators. They are based on a machine instruction 
found in the machine on which C was originally implemented {a violation of the 


portability principle). These operators: can be applied only to lvalues and if 


used as "++x" have the meaning "x = x + 1" Logically then, one would assume 
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that "++{++x]" would have meaning "x = (x + 1) + 1". However, “++{++x)]" is an 


illegal construct because (++x) is not an lvalue. Further, these operators have 
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different meanings depending on their relative positions. The meaning of "++ x 
is increment x and then use its value. Conversely, "x++" means use the value 
of x and then increment x. Thus, the statements "y = ++x" and "y = x++" have 
two very different meanings although they look similar. Further confusion is 
possible if these operators are applied to pointers. Then the increment value 
Is no longer 1, but the size in terms of bytes of storage of whatever is referred 
to by the pointer. Clearly, the increment and decrement operators violate the 
structure and syntactic consistency principles. 

C does not support information hiding or encapsulation very well. The 
language is composed only of functions. A function can return, ab most, a single 
value. Thus, global variables are often required to make efficient use of 
function calls. A related point of confusion concerns functions. All actual 
parameters are passed by value, except arrays, structures, and unicns, These 
may only be passed by reference. This is but one of many exceptions to generally 
stated rules in C, 

A good example of exceptions to the basic rules in C concerns the definition 
of what constitutes a legal identifier name. The rule may be descnbed as 
follows. The first character must be a letter. Capitalized letters are different. 
from lower case letters. Identifier names may be any length, however only the 
first eight characbers are significant, unless the identifier has an external 
storage class in which case only the first seven positions may be significant 


and lower case may be considered the same as upper case, depending on the 


machine being used. This rule is not extraordinary at all and, it would seem 


) 
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that the makeup of an identifier name is a basic element which should be 
portable across mnany machines without exception. 

There are many other examples which show the irregularity of the C language. 
some scoping rules are based on the type of the declared object. There are 
poorly designed major constructs such as the switch statement. The language 
offers implicit and explicit aliasing of vanables. However, continued discus~ 
sion of these and similar issues is not the primary intent of this thesis. 

This chapter has been included in an effort to provide some insight into 
the difficulkies encountered in designing and implementing a syntax directed 
editor for the C language. At the start of the project, it was hoped that some 
degree of protection for the user might be built into the editor. However, the 
possible interaction of many of C’s irregular features made this goal extremely 
difficult. A few protective measures were included in the editor, but not as 
INany as was originally mtended. 

The purpose of this chapter is not to damage the reputation of a lanjruage 
which enjoys widespread use. As stated earlier, C should be considered as a 
high level assembly language. As such, C is a very good tool for its intended 
purpose. However, a more regular language would aid the implementation of a 
DE. A serious effort was made to produce a strict grammatical detinition of 


the language. The results of that effort are contained in Appendix A. 


44 


V. C-SMART USER'S MANUAL 


C-Smart is intended to behave exactly as its name implies. The editor 
is smart, but it is not brillant. It is able to do many things automatically 
and it tries to assist the user in several ways. However, the editor can be 
circumvented if the user so desires. The most salient feature of C-Smart is 
its template system. There is a template for every construct in the C program— 
ming language, including the preprocessor commands. T’he punctuation required by 
each construct can be automatically provided by the editor. The template system 
has five major subsystems. These are the preprocessor, declaration, function 
definition, statement, and comment template components. 

The template system is very closely linked to two other major components 
of the editor, the input monitor component and the macro substitution component. 
Both of these components interact in very basic ways with each of the template 
subsystems. In order to understand the template operation, it is necessary to 
first comprehend the input monitor and macro substitution components. Thus, 


these two components will be discussed first. 


A. INPUT MONITOR 

The input monitor controls all user input to the editor. It serves as a 
line editor within the overall screen editor. The input monitor operates on 
the bottom three rows of the screen in the user input area. This area is 
separated from the rest of the screen by a dashed line, the separation line, 
which has its own function. The line provides a mechanism to display the 
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nesting level of the current line of the program. © relies on the convention 
of using the "{" and "}" characters to mark the beginning and the 
ending of blocks. When a new block is entered, the editor will display the 
"{" character on screen at the appropriate indentation level. Because the 
entire program can not be displayed on screen at one time, it is easy to lose 


track of nesting levels. Thus, whenever the editor displays the opening on 
screen, the matching "}" is displayed on the separation line at a corresponding 
indentation level. At block exit, the "}" is moved from the separation line to 
the screen and a comment is appended to the "}" indicating the type of block 
being exited (eg. "} /* end for */" would be displayed). This action is done 
automatically. Because there are only eighty characters on the separation line, 
the nesting level indications are limited to a depth of sixteen blocks. 

The input monitor recognizes several user commands. These are distinguished 
from text entry by use of the control key (represented graphically as ’T’). 
The control key and the command key should be depressed simultaneously to enter 
a command. Figure 13 lists the control keys recognized by the input monitor. 

When the input monitor recognizes a fa command it will attempt to add every— 
thing from the current cursor position to the end of the user input line as a 
new macro defimtion. If two ta’s are recognized on the same line, the input 
monitor will attempt to add everything between the two marks as a new definition. 
If three or more ta’s appear on the same line, the value of the closest mark 
already defined will be adjusted. Only one macro definition may be made for 
each user input line. The fi command will cause the editor to ignore any 
marks previously set on that line. If additional fa commands are not used, no 


new definition will be made on that line. The tu command instructs the input 
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monitor to use one of the pre-defined macro definitions. There are two ways to 
identify the desired substitution. The user should either enter enough letters 


of the definition to disambiguate it from any other definition or enter a 


eam 


Te exit to the operating system 

tm return 

ta add a macro definition 

hh ignore macro definition request 

tu use a macro definition 

td delete the character to the left of the cursor 
te delete to end of line 

tx x-out the entire line 

ts move the cursor to the start of line 
tn move the cursor to the end of line 
tl move the cursor one character left 
tr move the cursor one character right 





number corresponding to the position on screen of the desired substitution. 


If the first method is used, a blank must follow the disambiguating letters. 
When a blank is not desired in the input, the number designation method may be 
used. The substitutions are numbered 1 to 17 from left to right, top to bottom 
as they appear on screen. When the fu key is pressed, the editor will insert a 
'@’ character at the current cursor position to mark the location where the 
substitution will be inserted. When the return key is pressed, the user input 
line will be moved to the screen and the ’@’ character will be replaced by the 
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indicated substitution. Examples of using the macro substitution feature are 
provided in later discussion. 

The cursor movement commands operate exactly as indicated in Figure 13 and 
require no further explanation. If the editor recognizes the fc command, control 
will be immediately passed to the operating system. Files in use will not be 
saved or closed. When the tm or, more conveniently, the return key is pressed, 
the input monitor will return control to the editor. The return key signals 
that the user has completed his actions. It is often used to signal that no 
further selection is desired from the current menu. 

The td command deletes the character immediately to the left of the cursor. 
The fe command deletes all user input from the current cursor position right— 
ward to the end of line. The tx command will delete the entire user input line 
regardless of cursor position. A fresh input line will be supplied after the 
deletion. 

There are several keys which the input monitor does not recognize. A beep 
IS sounded whenever one of these keys is pressed. The unrecognized keys are 
those which have an ASCII value of less than 32 and have not been designated 
as command keys. Several examples are the delete, backspace, and tab keys. 

A key function of the input monitor is to control line length. The C 
language allows logical lines to occupy more than one physical line when 
the ’\’ line continuation character appears at the end of the continued 
physical line. The editor will recognize lines which are too long and 
will insert continuation characters as needed. 

The editor also controls indentation. Blank spaces will be inserted in 


front of the user input when the screen is refreshed. Thus, the user need never 


48 


concern himself with the indentation required on any line. The editor displays 
the ’*’ character in the user input area to show the user how much physical 
Space, minus leading blanks, is available on the current physical line. If the 
user enters input beyond the ’* mark, the editor will insert the continuation 
character, display the first part of the logical line, and present a fresh 
input line in the user input area so that the logical line may be continued. 
The process of extending logical lines across several physical lines may be con- 


tinued indefinitely. However, unreadable programs could easily result. 


B. MACRO SUBSTITUTION COMPONENT 

The substitution component is intended to allow the user to enter correct 
input only once and use it repeatedly. Information can be called from the sub- 
stitution facility any number of times and at almost any point in the editing 
process. There are two areas of storage reserved for the substitution component. 
These are initialized to support declarations and statement entry options. The 
content of either area may be modified or used by means of the fa and fu keys 
respectively, as discussed in the previous section. 

The declaration support storage values will be visible at any time that 
the current line of the program could contain a declaration. The predefined 
values are the standard types and storage classes available in C. When these 
values are visible, they are displayed on the top five lines of the screen, 
just above the text area. Only the first few characters {11) of any visible 
item will be displayed, regardless of the actual length of that item. This 
restriction is due to the limited screen area available to display the view 


definitions. 
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The presentation of the statement support values is similar. The 
predefined values are functions available in the C standard library. These 
definitions are visible whenever the editor expects a statement to be entered 
on the current line. 

The user is free to define any construct to be stored for future use. The 
limitations are that there may only be 34 such definitions extant at any time 
and that any single definition may not exceed fifty characters in length. If an 
attempt is made to store longer strings. the editor will notify the user. In 
such a case, the user should break up his storage requirement into two or 
more definitions. 

Only one construct per user input line may be saved in a macro storage 
area. Once a construct has been successfully designated for storage, the user 
will be asked which one of the currently visible definitions should be replaced. 
The user should respond by providing a number, 1 thru 17, corresponding to the 
position, on screen, of the view item to replace. Subsequently, the visible 
definitions area of the screen will be updated to reflect the currently stored 
values. 

The substitution component is a convenient means of referencing frequently 
used constructs. It may also be used to temporarily hold a value during editing. 
As an example, if the user wishes to surround a statement with a while template, 
the substitution facility can hold that construct and then reinsert it at the 
appropriate point within the while template. This process provides an alterna— 


tive to the free input mode of program modification (to be discussed later). 
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C. TEMPLATE COMPONENTS 

The template system has five major subsystems. Each of these contain pre - 
defined templates appropriate to some part of the C language. The largest of 
the subsystems is the statement template subsystem. 

The statement subsystem holds templates for each statement type in the C 
language. A menu of these templates will be presented whenever a statement 
could be entered as the current input line. To select a particular template, 
the user holds the shift key and presses the key corresponding to the 
menu designation of his choice. The selected template will then be displayed 
on screen and the user will begin to fill in the parts of the template. 

The only exception to this procedure is the expression template because 
expressions are the lowest level and most frequently used construct in the 
language. Thus, they should be very easy to select. The editor does hold a 
special expression template which may be selected. However, this can be 
cumbersome given the frequency of expression usage. To ease expression entry, 
the editor adopts the convention that anytime a lowercase letter is struck when 
the statement menu is available, the user wishes to enter an expression. The 
editor automatically selects the expression template and accepts the user input 
as an expression. Again, the editor adopts this convention only when the state- 
ment menu is available to the user. 

The comment template is intended to support comment blocks. The opening 
and closing comment characters will be automatically inserted on every physical 
line when the comment template is in use. This template is also able to 
accomplish word wrapping. If user input exceeds the line space available, the 


editor will break that line at the last complete word, add the close comment 
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characters to the line, display the line on screen, insert the leftover portion 
of user input into a fresh input line, and allow the user to continue his 
input. These actions are performed automatically. 

Comments may be inserted without using the comment template. Any template 
will accept a comment as an integral part. Word wrapping will also be performed 
on comments entered with other constructs. Further, if the start comment 
characters are entered on any input line, and no close characters are also 
present, the editor will assume that the line ends in a comment and add the 
close comment characters. 

The function definition template does not operate in a menu driven manner: 
there is only one template available for selection. When the user selects the 
function definition template, he receives the "<type> <function name> (<parameter 
list>)" prompt. The type component is optional and assumed to be ‘integer’ if 
not supplied. The shortest user input under this template would be of the form 
"fname (". On this input, the editor would add the closing ")" indicating an 
empty parameter list, display the input, display the opening block "{" char- 
acter, and move to the local declaration template to await user input. A more 
common class of input would be of the form "fnamefa, b, c" where a, b, and c 
are the formal parameters. On input of this type, the editor would again add 
the closing ")" and display the user input. The next action would be to display 


ne 


in the user input area and ask for a type designation. Once provided, ’a 


} 


a 
and its type would be displayed. Then 'b’ would be displayed in the user input 
area and the “<type> or @" prompt would appear. If the user entered a type, then 
’b and its type would be displayed on the next physical screen line. If ’@’ 


were entered, then ’b’ would be added to the line holding the declaration of 
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a’, indicating that ‘a’ and ‘b’ were of the same type. This process would 
continue until the formal parameter list was exhausted. Then, the local 
declaration template would be provided. 

The declaration template also operates without a menu because there is 
only one template available. This template has the very basic form "“<type> 
<identifier list>". It is very difficult to supply a template for every 
possible legal C declaration. Thus, the user is free to enter anything he 
desires under the declaration template. The only restriction is that the 
storage class and type should be entered first and the identifiers and any 
initializers next. This template can be cumbersome in some cases. As an 
example, to declare a struct with initializers, the user should enter ’struct’ 
the struct name and the opening "{" when given the <type> prompt. When the 
<identifier> prompt is presented, the user should enter the type and name of 
the first struct member. Then, on subsequent <type> and <identifier list> 
prompts, the type and name of struct members should be entered. When the 
last struct member is specified, the user must explicitly add the closing “}" 
and any identifiers declared to be struct’s of the type described. 

The global and local declaration templates are basically the same, having 
only two small differences. Auto and register macro definitions are not 
available when declaring global variables. Also, the local declaration 
template will not accept initializers for variables which do not have the 
static storage class. These restrictions are in keeping with C language 
requirements. 

The preprocessor template subsystem operates in a manner very similar to 


the statement subsystem. There is a template available for every preprocessor 
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command available in standard C and any one may be selected from a menu. As 
with the statement templates, the editor will add ending punctuation. One 
exception to this rule occurs when the "#asm" template has been selected. The 
editor does not have any knowledge of assembly languages. Thus, no assistance 
is available under this template. The user is entirely responsible for ensuring 


that the requirements of his assembler are met. 


D. MENUS 

As discussed, most template selections are menu driven, the only 
exceptions occurring when only one choice is possible. Whenever a menu is 
available, it is displayed on the top two lines of the user input area. Except 
when entering an expression under the statement menu, the user should select an 
option by typing the capitol letter corresponding to the desired menu object. 
In general, once an item has been selected, its template will appear on screen 
and the user will begin to receive prompts to complete the vanous subparts of 
the template. If an illegal selection is made, the user will be so advised and 
requested to enter his selection again. If the user enters no choice {return 
key only), the menu will end and the preceding menu will reappear on screen. 

The top level menu does not work directly with the template system. It 
offers the options as depicted in Figure 14. The cursor up, cursor down, next 
screen, and last (i.e. previous) screen commands operate as their titles imply. 
Cursor movement commands will not present new screens however. The next and 
last screen commands must be used for this purpose. The cursor ( ’>’ } is 
displayed in the far left margin of the line that the editor recognizes as 


being the current line. 
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The append option provides the means of text entry. When selected, text 
will be inserted after the current line. Thus, the cursor should be positioned 
appropriately before the append option is selected. If text is to be entered as 
the first line in the file, the cursor should be positioned at the blank line 
at the top of the first screen. After the append option has been selected, the 


user will be asked if he desires the template system or the free input mode. If 
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Adelete Bsearch Cline up Djine down E-edit 


F append Gnext screen Hlast screen I.quit Jsave 
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the template system is selected, the editor will determine which type of con- 
struct may appear as the current line, set up an appropriate level of 
indentation, and present a menu for template selection. If free input mode is 
selected, the template system will be bypassed entirely. The user is free to 
enter any construct as the editor operates very much lke a conventional editor 
in free input mode. Thus, the user is responsible for all indentation and 
punctuation decisions. Free input is intended to ensure that program modifica— 
tions can be accomplished as easily as possible. A caution is that the editor 
will not recognize the type of any construct entered as free input. Subsequent 
modifications to these constructs can not be made in the template mode. 

The editor contains a very basic search command. Only the buffers currently 
in memory can be searched. If a global search is required, the user will have 
to manually cause the buffers to be exchanged and repeat the search in each 


buffer. The search command does support an ’entity search’ option, primanly 
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intended to locate identifiers. Often, the characters that constitute an 
identifier name may appear as a substring of additional tokens. Conventional 
editors will return every match for a search object, regardless of whether or 
not that match is a unit by itself. Entity search will find a matching string 
only if it is an entity itself. This search mode is possible because the editor 
knows the syntax of C. When a possibly matching string is found, the characters 
before and after the matching sequence are inspected. If either character is a 
legal identifier character, no match is returned. If both characters are legal 
separators (ie. +, * & etc.) then a match is returned. An entity search is 
requested by typing the ’@’ character in front of the target string. This char- 
acter will not be considered when searching. If the ’@’ does not precede the 
target string, a normal search is performed and all matches are returned. 
Once a target has been found, the line containing that target is moved to the 
user input area and may be edited. When the user presses the return key, the 
search is continued. This process repeats until the entire buffer in memory 
has been searched. If no match is found, no lines will be moved to the user 
input area. 

The last two options available from the top level menu are quit and save. 
The difference between them is that quit is a faster means of exit because it 
does not generate a file suitable for input to a compiler. The editor works 
with a file which contains fixed size pages so that text can easily be moved to 
and from the disk. Kach line in this file carries with it data which allows the 
editor to determine the type of that line and other pertinent information. Thus, 
the editor must process this file into a form suitable for a compiler, striping 


off the header information and deblocking each line. 


96 


The editor is only capable of working on files which abide by the above for— 
mat rules. Thus, the editor can only be used on files that it has created. The 
files which are used by the editor can be recognized by their ".dcs" extension. 
These files should not be printed because they contain information which is 


in integer rather than ASCII format. 


BK. A SAMPLE SESSION 

This section contains a sample session with C-—Smart. Program constructs 
from Reference 12 have been selected to be entered as a new program being 
created by the editor. 

The disk containing C-Smart should be placed in the ’'A’ disk drive. The 
file to be edited may reside on any drive. This example session will create 
a file named “sample.c" on the 'B’ disk drive. To invoke the editor, select the 
‘A’ drive and enter "CS B:SSAMPLE.C" followed by a carriage return. Initially, 
the message "C—Smart, Ver 10 NEW FILE" will appear on the screen. Then, the 
screen will clear and be refreshed to appear as illustrated in Figure 15. 

Figure 15 and all other figures depicting screens are slightly off scale. 
The dashed separation lines are longer in the figures than they actually 
appear when they are displayed on the computer monitor. The top line of Figure 
15 displays the name of the file being edited. The next three lines display the 
macro substitutions that are currently visible. Notice that Figure 15 has no 
macro substitutions visible and, thus, none are available. This is because the 
editor is at its top level and no statement can be entered at this point. The 
cursor { ’> ) is positioned at the first line of the file. Sample.c is a new 


file, so its first line is blank. Below the bottom dashed line, the top level 


of 


menu is being displayed and the user is being prompted for a selection. In this 


example, append was chosen to create a new file. 


SSS ee SSS ee Se eee 


File BSAMPLEC 
Visible: 


A.delete Bsearch Cline up Dline down Bedit 
F. append Gnext screen Hlast screen [quit J.save 
select A thru J ==>> _ 





Choosing the append option when creating a new file does not produce the 


option of free input. Instead, the user is presented with template system 
options to enter a block comment, preprocessor commands, global declarations, 
or a function definition. A function definition template was chosen. Then, the 
function name, parameter list, and a comment were entered. Figure 16 depicts 
this situation, just prior No) pressing the return key. Notice that macro 


definitions are now visible. These are types and classes that may be used in 
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the definition of a function. Also, the ’* character appears at the far right 
of the user input line. This is a mark generated by the editor to inform the 
user of the physical space remaining on the current line. There are also two 
identical constructs on the screen, one above and one below the bottom dashed 
line. The top construct is a placeholder, When the user has completed his 


input, it will appear on screen at the position of the placeholder. The construct 
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File BSAMPLE.C 
Visible: char double extern float int 
long short static struct ty pedef union 


<type><function name>(<parameter list>) 


———ooeo oe oe ee ee ee ee em me me me ee cm ee mm cm oe ce ce ce cs ce ee eo re ce es ee ee ee ca eee es es es ee ee ee wee wee wes es wee es es ee es ee oe oe ee 


<type><function name>(<parameter list>) 


shell(v,n) /*sort v(O}..v[n—1] into increasing order : 





below the dashed line 1s a prompt. This is the syntactic element that the 


editor expects the user to enter. A function definition is a rare case where 
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the two constructs are identical. In most cases, the prompt is only a part of 
the placeholder on the screen. 
The user strikes the return key and the screen changes as depicted in 


Figure 17. First, notice that the closing comment characters have been added. 


_——< eS aS Aas SS i ee ———— as Ss eS 


Pile. BSAMPLE.C 


Visible: char double extern float int 
long short static struct ty pedef union 
auto register 


shell(v,n) /* sort v(0]. v[n—1] into increasing order */ 


————<———— ee eee er ee ee es ee ee ee ee ee em Oe ee te ee em em me ee et ee ee ee a ee ee ee we ee we we we we we we wee we we we ee ee es ee ees ee ee ow es ee es es 


Fig. 17 ce ‘Smart Screen 3 





Then, the user has been prompted for a <type> (just below the bottom dashed 


line). Also, the cursor in the user input area is positioned just before the 


Pe 


Character ‘vy’, the name of the first formal parameter. One final change is 


3 


in the ’Visible’ area. Two new definitions have been added. This is because 


60 


"auto" and “register" variables are not allowable in global declarations (as in 


functions), but they may be used in local declarations. 


File: BSAMPLE.C 


Visible: char double extern float int 
long short static struct typedef union 
auto register 


—_—_—orerororor—r wee eee ae ss Se ee we ee ce re oe ee ee ce ec we ar ee we es we we te ee we we me we ee ee ee ee wee ee ee ee ee ee ee ie ie ea a ee ee 


shell(v,n) /* sort v[0] v[n—1] into increasing order */ 
int v; 


—_—_—_——o—eo—e—eoe eee oe eee ee ee a aw a ww ww aw ws we a we a ws ae a ae ae ae ee a ae ee ee ee ae ae ae a ee ee 





The user enters the type int and strikes the return key, producing the 


screen as displayed in Figure 18. The declaration of ’v’ has been moved to the 
screen and the user is receiving the "<type> or @" prompt for the formal para— 
meter ’n’. Both ’n’ and ’v’ are integers, so the user has entered ’@’. Pressing 


the return key produces the screen depicted in Figure 19. Here, the formal 


parameter list has been exhausted, so the editor has entered the opening block 
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"{" and activated the local declaration template. Note that a matching close 


block “}" character is displayed on the bottom separation line. 


= eee ee 


Pile BSAMPLEC 


Visible: char double extern float int 
long short sbatic struct typedef union 
auto register 
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shell(v,n) /* sort v(0|..v[n—1] into increasing order */ 
ink v,n, 


{ 


<type> <identifier list> 
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The user continues to enter his program until the state of Figure 20 has 
been reached. Here, the user is in the middle of entering a for statement. The 
last remaining incomplete part of the for template is "<expression>". Notice 
that a "<statement>" is an integral part of the for template. This fact holds 
true for many statement class templates. Also, notice that a new macro defini— 
tion is visible. These pre-defined values are available whenever a statement may 


be entered. As a final comment regarding Figure 20, notice that the end of line 
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marker on the user input line is much closer to the left margin. This is a 
reflection of the available space remaining on the current (physical) screen 


line. 


ee ee ae 


Pile) BSAMPLE.C 


Visible: atol exit{) fclose( fopen( fprintf( 
fscanf( getchar() iboal printf( putchar( read{ 
scan{{ sizeof sprintf{ sscanf( strepy( write( 


ee ce ee ce ce ee cm eee ee ce ee ce ce ee ce ey em i ee ce ec ce ee ce ew ec ce ee ee i ee ee eee ee ee ee ee ee 


shell{v,n) /* sort v{0]..v/n—1] into increasing order */ 
int v,n; 
{ 


int 1, n, gap; 


for(gap = n/2, gap > 0, gap /= 2) 
for{i = gap, 1 < n; i++) 
for{j = 1-gap, } >= 0 && v{j] > v[j+gap]; <expression>) 
<statement> 


Sa ene = ne -- 2 S8ee--— enn nnnn-nn---------—- 


<expression> 


) -= gap. 





Again, assume that the the user has continued his input up to the point 
reflected in Figure 21. Here, two "}" characters are present on the bottom 
separation line and the statement menu is available. The function "shell" is 
complete and the user is ready to move to the next function definition. The 
user presses the return key one time to exit the statement menu. Then, the 


rightmost "}" is moved from the separation line to the screen and the statement 
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placeholder reappears at an indentation equal to that of the "for(gap = 2..." 
statement. Again, the user strikes the return key and the screen is refreshed 
to resemble Figure 22. At this point, no macro definitions are visible because 
the top level menu is available. Also, note that appropriate comments have 
been added to the closing block characters and that two blank lines have 


been generated to separate function "shell" from the next program construct. 


SSS ee 


Pile BSAMPLEC 


Visible: atol exit() {close fopen( {printf 
[scani( getchar() itoal printi{ putchar( read( 
scan{( sizeof sprint{( sscan{( strcpy( write( 
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shell(v,n) /* sort v{0]..v[n—1] into increasing order *} 
int vn: 
{ 


int 1, 0, gap; 


for(gap = n/2; gap > 0; gap /= 2) 
for(i = gap; 1 < n; i++) 
for(j = i-gap; j >= 0 && v{j] > v{j+gap]; } -= gap) 


temp = vij]; 
vj] = viirgapl, 
vij+gap] = temp; 
<statement> 
Te se ee }atee------25- seeks 4. ____ rr 
A<comment> Bbreak C.continue Ddo B<expression> F <for> G<goto> H<switch> 
lL<af> J<else> K<return> L.<label> M.<multiple> N.<while> O.<;> P.<preprocess> 
Select A thru P ==>>_ 





Figure 23 depicts the situation when the user is including the standard 


library file “printf.c'". He is using the preprocessor template and has received 


the prompt requesting a filename. The user has entered only "printf". The 
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editor assumes default extensions of "cc" and that quotes (as opposed to angle 
brackets) will be used to specify the location of the file. Thus, the editor 


generates the included filename as illustrated in Figure 24. In Figure 24, the 


a eS 


File: BSAMPLE.C 
Visible: 


int 1, n, gap, 


for(gap = n/2; gap > 0; gap /= 2) 
for(i = gap, 1 <n, 1++) 
for(] = i-gap, ) >= 0 && v{j] > vij+gapl; | -= gap) 


temp = v{jl; 

v(j} = vij+gapl, 

v[j+gap} = temp, 

} f* end for * 
} {* end shell * 


—_— Se a ee oe Se ee Se ee ee ee ee es ae a a ee ee ee SS eae SS SS SS SS SS Swe SMe Ow SO SS SP Se Se SS Se SP Se Se ee ee ee ee Se ee ee ee eee ee 


A<comment> B.<global declaration> 
C <function definition> D.<preprocessor> 
Select A thru D ==>> _ 





user is entering local declarations to the function “main". He has provided a 


type of "long" for the current variable "lineno". The user has also tried to 
initialize that variable. The editor has responded that the variable must be 
declared as "static" to be initialized and has repositioned the user input 
on the user input line for editing. At this point, the user should insert the 


static storage class into his declaration before proceeding. 
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SSS SSeS SSS SSS eee 


Pile: BSAMPLE.C 


Visible: atoi exit() fclose( fopen( fprintf{( 
fscan{( getchar() itoa( printi{ putchar( read( 
scan{( sizeof sprinti( sscan{( strepy( write( 


for(gap = n/2, gap > 0; gap /= 2) 
for(i = gap, i < n, i++) 
for(] = i-gap, } >= 0 && v{j] > v[j+gapl; ) -= gap) 

{ 
temp = J], 
vil = vij+eapl, 
v{j+gap] = temp; 
} /* end for */ 

} f{* end shell */ 


#define MAXLINE 1000 
finclude <filename> 


cp cp ce cp ce ee pe a ep ec ee ee ee me ec ee ee ee ee ee ee eee es ee ee ee SS 


printf ‘ 
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ee re ee 


Pile: BSAMPLE.C 


Visible: char double extern float fint 
long short static struct ty pedef union 
auto register 


v[j+gap] = temp, 
)iesend for */ 
} /* end shell */ 


#define MAXLINE 1000 
#include "printf.c" 


main(argc,argy) 
Int argc; 
char ‘argv: 


char line[{ MAXLINB], *s, 
long <identifier list> 


---}------~-----------_-------------~----------~--------------------------------------- 


must be static to have initializer 


lineno = 0; : 
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Figure 25 depicts an interesting property of the editor. The user has 


entered an expression to be used in a while template. The editor knows that 


the input should be a conditional expression. It examines the user input and 


determines that an assignment expression has been entered. This is a very 


SSS SSS 55S 


File: BSAMPLE.C 


Visible. atol exit() fclose( 
fscanff getchar() itoal printi( 
scanf( sizeof sprintf{( sscanf( 


#define MAXLINE 1000 
#include "printf.c* 


main(argc,argy) 
int argc, 
char *argy|]: 


char line[ MAXLINB], *s; 
long static lineno = 0; 
static int except = 0, number = 0, 


while (<expression>) 
<statement> 


~--}-------------------+----------------------------+----- 


Do you want an assignment here? 


--arge > 0 && (*++argy)[0| = ’~’) 


fopen( f print!( 
putchar{ read( 
strepy( write( 


ee ee ee es ee ee ee ee ee ee eee ee es ees eee ee 





frequent error made in C which can cause infinite loops. Recognizing this, the 


editor repositions the input on the user input line with the prompt "Do you 


want an assignment here?". The user may edit his input or, by only pressing the 


return key, leave it unchanged and continue his editing session. 
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Figure 26 exhibits the macro defimition property of the editor. In this 


case, the user wishes to access the function "printf", Instead of typing the 


———— 


File BSAMPLEC 


Visible: atol exit() fclose( fopen( fprint{( 
fscanf( getchar() itoa( printf ( putchar( read( 
scan{( sizeof sprinti( sscan{f strepy( write 
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static int except = 0, number = 0; 


while (--arge > 0 && (*++argyv){0} == ’-’) 
for (s = argv{d] + 1, *s = “\0", s++) 





switch(*s) 
{ 
case ’x’: 
except = 1, 
break, 
case ’n’ 
number = |; 
break; 
default: 
¢statement> 
Meee === ---_i- ee a ee eR 8 -._______._-_--___- 
¢<expression> 
O9"find illegal option %c\n", *s) 
Ry ig. 26 C- een 12 ae 


ae nee eae ee ee wee 
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function name, a substitution will be made. The editor generated the ’@’ char— 
acter when the user issued the Tu command. Then, printf was designated by the 
number 9. When the return key is pressed, “printf(" will be substituted for the 
characters "@9". Figure 27 depicts the screen after the substitution has been 


made. 
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Pile BSAMPLEC 


Visible: atoi exit() iclose{ fopen{ {printf 
fscanf( getchar() itoal print!( putchar( read( 
scanf( sizeof sprintf sscanf{{ strepy{ write( 
switch({*s) 
{ 
case ’x’ 
exceot = 1, 
break; 
case ’n’: 
number = 1, 
break, 
default: 
printi("find: illegal option %c\n", *s); 
arge = 0; 
break, 
} /* end switch */ 


<statement> 
---} see cee me eee ee een ee ee 


A<comment> B.break C.continue Ddo EXexpression> F <for> G<goto> H.<switch> 


Laf> J<else> K<return> L.<label> M.<multiple> N.<while> O<;> P<preprocess> 
Select A thru P == 
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Figure 28 demonstrates a check that the editor can make for the user. The 
editor can ensure balanced parenthesis, double quotation marks, and _ single 
quotation marks. In Figure 28, the user has entered a single quotation mark. 


The editor has discovered this fact and has given the user an opportunity to 


ee ce ee Sr 


Pile: BSAMPLE.C 


Visible: atoi exit() fclose( fopen( f printf 
fscanf( getchar() itoa( printf( putchar( read( 
scanf{ sizeof sprintf sscan{( strepy( write( 
arge = 0; 
break; 


} /* end switch */ 
if (arge != 1) 
printi("Useage: find -x -n pattern\n"); 
else 


while (getline(line, MAXLINE) > 0) 


lineno++; 
if ((index(line, *argy) >= 0) != except) 


if (number) 
printi("%ld: ", lineno); 
<statement> 
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printfi("%s, line); ¢ 
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correct the situation. He is free to edit the line for corrections. He can 
also ignore the situation by pressing the return key. This option may be 
needed if one logical line has been extended over two physical lines and the 


matching quotation mark is on another physical line. 
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Figures 29 and 30 demonstrate the modification capabilities of the editor. 
The user has decided to add a block comment before the function "main". To do 
so, he returns to the top level menu, moves the cursor to the "include" 


line and selects the append option. The editor responds by presenting the 


——— SSS SSS SSS SS ee eee 


File: BSAMPLE.C 
Visible: 


for(gap = n/2; gap > 0; gap /= 2) 
for(i = gap; 1 < n, i++) 
for(j = i-gap; } >= 0 && vi) > vij+gapl; ) -= gap) 


temp = v\jl, 

vii] = vij+gapl, 

vijrgap| = temp, 

} {* end for */ 
} /* end shell */ 


#define MAXLINE 1000 
> finclude “printf.c" 


A.template 


B free 
select A thru B ==>> _ 





template or free input options. The user selects the template operation and, 


subsequently, selects the comment template from the preprocessor menu. In 
Figure 30, the user has continued his comment input beyond one physical line 


and the editor has wrapped the input to a new physical line. 


y2 


eee eee 


File: BSAMPLE.C 


Visible: atoi exit() fclose( fopen( {printf 
fscan{( getchar() itoa( print{( putchar( read({ 
scanf{f sizeof sprinti( sscanf( strepy( write 


for(gap = n/2, gap > 0: gap f= 2) 
for(i = gap; 1 <n, 1++) 
for(} = 1-gap; } >= 0 && v{)] > v{jreapl; } -= gap) 

{ 
temp = vij], 
vii] = vij+gapl, 
vij+gap| = temp, 
Df eend tor */ 

} {* end shell */ 


#define MAXLINE 1000 


finclude "printf.c" 
/* The below function is taken from p. 113, The C Programming */ 
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continue 


/* Language, 
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Figure 31 exemplifies the manner in which the editor advises the user of 
buffer status. In this example, the user has elected the "save" option from 


the top level menu. The status line in the user input area provides the 
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for{i = gap, 1 < n; i++) 
for(j = i-gap, } >= 9 && v{j] > vij+gap); } -= gap) 


temp = yij], 

vil = vli-gapl; 

vij+gap| = temp, 

Pend tore) 
} {* end shell */ 


#define MAXLINE 1000 

#include "printf c* 

{* The below function is taken from p 113, The C Programming */ 
/* Language, by Kernigham and Ritchie */ 
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“writing... message as the compiler suitable file is being generated. The 


"Swapping..." message may also appear when the editor needs to page in a 
new block of program text from the disk. The final status message, 
"merging..." 1S displayed when the internal edit buffer has been filled and 


must be merged with the existing file. 
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VI. CONCLUSIONS 


The primary conclusion to be drawn from this effort is that a basic 
syntax directed editor can be implemented on a minimally resourced machine. 
However, space restrictions must be considered when making any significant 
decision. Time constraints proved to be less of a factor than was originally 
anticipated. Despite a very slow CPU speed, the total hardware/software system 
proved able to provide timely responses in most instances. The only operations 
which were degraded by the low speed concerned the file system. Paging blocks 
in and out as well as merging files when the edit buffer filled proved to be 
very expensive. A faster clock would have helped this situation, but could not 
have completely eliminated the associated user inconvenience. 

The space constraints of the project required tradeoff decisions to be made 
constantly. Many decisions determining the inclusion of facilities were based 
solely on the availability of space. The system now consumes fifty out of an 
available fifty-eight thousand bytes of main memory. Thus, only eight thousand 
bytes remain for run time storage requirements. This has not caused any problems 
in the testing of the editor to date. It is possible that a program requiring 
many levels of nested constructs could need more run time memory than is 
available since the editor accommodates nested structures by recursive 
procedure calls. Test constructs have been successfully nested to sixteen levels, 
which is the deepest level that the screen can display and that the compiler 


used for validation can accomodate. 
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A. TRADEOFPS 

There are several features which were not included in this version due to 
limited space. The first such feature is holophrasting. The data structure which 
contains user input is a doubly linked list of records. Each record consists of 
a line of user input and various syntactic information concerning that line. One 
of the units of information is the tab level of the line. This information was 
specifically included to ease the implementation of the holophrasting facility. 
Further, the tab levels of the templates were designated to support holophrast— 
ing. Thus, all that is needed to implement the feature is the code to obtain 
the users desired level of view and then retrieve the statements at that 
level. Many of the utility functions in the editor could be used to support 
the holophrasting feature. 

Another feature which was planned but not implemented for similar reasons 
concerned a search utility. The data structures are designed to allow the easy 
retrieval of standard syntactic constructs as units. The user would have been 
able to retrieve all "for" statements as an example. Again, all that is needed 
to implement this feature is the user interface. 

A significant feature which was planned was a small scale version of the 
Interlisp DWIM facility. Algorithms were developed to allow the detection and 
correction of spelling errors which involved one or two characters or trans— 
positions. Again, some of the basic utilities in the editor were designed to 
support this facility. The implementation of the DWIM feature would have 
required a large amount of memory, primarily to hold a symbol table, which is 
not now included in the editor. Another problem with the DWIM feature is that 


it could not be implemented in a regular manner. Many C programs are created as 
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separate modules. These modules are included with others at compilation to 
produce a complete program. Identifiers can be declared in one module and 
referenced in another. Thus, the editor could not build a symbol table which 
contained all legal identifier references without a knowledge of the dynamic 
nature of the program. Because it could not be applied regularly and because it 
would require a great deal of storage, the DWIM facility was not implemented. 

The editor does not contain a good search utility, block move, block delete, 
or search and replace options. These could all be included with very little 
effort. Also, no type of "UNDO" mechanism is available. The delete function as 
implemented was designed with an undo capability in mind. Nothing is actually 
removed from the program. Only pointers in the doubly linked list are changed 
so that the deleted item is skipped over. Thus, a simple undo facility would 
be supported by an ability to restore pointers. These features were not included 
due to time constraints on producing a working program. The editor is now 
functional and was designed and implemented in only three months of part time 
effort. The creation of over 2500 line of source code in that period of time 
was a large effort, possibly too large. 

The editor does feature very good support of comment inclusion, both block 
and single line. As originally designed, in line comments were to appear only 
at the end of the program line. This convention was really too restrictive and 
the supporting code was rewritten to allow comments to appear at any place on 
any line. This decision change proved to be relatively expensive, primarily due 
to feature interaction within the editor, First, the editor is able to generate 
much of the punctuation required by the C language. This is an important con- 


venience for the user due to the frequency with which errors related to omission 
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of punctuation occur. Also, because C does not include any run time diagnostics, 
such errors can be difficult to detect. To add ending punctuation, the editor 
first removes any comments that appear on the line. Then, the program part of 
the line is scanned for punctuation, which is added if not already present. 
When comments were only allowed at the end of the line, the saved comments 
could simply be appended to the correctly punctuated program line. When the 
comment convention was changed, the original input had to be saved so that it 
could be compared against the program buffer and the comment buffer in order 
to correctly reassemble the user input. Further, the editor had to be able to 
make decisions when the original input did not match the content of either 
buffer (due to added punctuation as an example). Thus, additional coding and 
storage were required to accommodate the new comment convention. 

A second feature which did not interact well with the comment decision 
concerned physical line continuations. Under the original comment plan, 
splitting one logical line across two physical line was relatively easy. If 
the line contained a comment, the split usually occurred within the comment. 
If no comment was included, then only the line continuation character { "\" } 
had to be positioned in the input line. When the commenting decision changed, 
the lines had to be scanned on continuation to determine where the break should 
occur. Further, if the break came within a comment, it was necessary to determine 
if more code followed the comment on that same line. In this case, not only 
did the comment have to be closed and continued, but the line continuation 
character had to be added to the input outside of the comment. These line 
breaking problems may not appear to be very significant. However, their 


complexity was hightened due to the macro substitution capability. Because 
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of this feature, as few as two input characters could generate as many as 
fifty actual characters. In total, the change to the comment decision caused 
approximately an additional one thousand bytes of code to be generated. This 
seems to be an expensive feature. However, the editor should not enforce 
arbitrary conditions and the user should be able to place a comment anywhere 
he feels one is needed. 

A final significant feature which was planned but not implemented was 
a type of bounds checking for array references. C does not include any run 
time ability to make such checks. Originally, it was planned to enable the 
editor to generate source code which could be implanted in a user program to 
notify him of bounds violations. This feature proved to be very expensive to 
implement. Because of the ways in which arrays can be declared, a great deal 
of code would have been necessary to determine the bounds of an array from its 
static form. Also, implementation of this feature would have been very difficult 
without including a parsing capability or a symbol table. 

One feature was replaced by another. The editor runs on the CP/M 
Operating system, which has some unusual rules governing the allowable char— 
acters in a legal file name. Originally, the editor actually parsed the user 
supplied file names to ensure their conformance to the CP/M rules. Very good 
error messages were provided in the event of an illegal file name. This 
feature was replaced by a facility which checks for assignment statements in 
conditional expressions. This decision was made because the later error is much 
more subtle and there is no way of automatically checking for it. Since the 
operating system would eventually notify the user of an incorrect file name the 


assignment check was deemed to be more valuable. 
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B. DESIGN ERRORS 

The major flaw in this implementation of the editor concerns its file 
system. During the primary design process, it was determined that the file 
system would be as simple as possible so that the major efforts could be directed 
towards the syntax related parts of the editor. This decision proved to be 
very costly. 

Early in the design process, it was recognized that most of the space 
requirements of the editor would be driven by its syntax directed nature. Thus, 
there would be very little space available for text buffers. To overcome this, 
a small scale virtual memory system was planned to facilitate mass input and 
output of pages of text. Fixed size pages were designed so that each page 
contained sixty-four records and one record corresponded to one line of text. 
Because these sizes were fixed, the space available for each line of text had to 
be fixed at eighty characters, the maximum allowable per line. Thus, each line 
(i. e. record) occupies the same amount of space whether it holds a single 
blank or a full eighty characters. 

This paging system has become the major limiting factor and the major 
inconvenience of the editor. Disk storage space was traded for memory space 
and the ability to exchange pages of text as rapidly as possible. The speed of 
the paging operation is still very slow and it is conceivable that using charac— 
ter at a time I/O would not produce a great decrement in system performance. 
If this is the case, it would be a better decision to eliminate the fixed aspect 
of the buffers and accept the lower performance. The current version of the 
editor consumes disk space in six thousand byte blocks. In fact, if only a 


single character is entered as user input, the editor will require six 
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thousand bytes of disk space to store that character. The storage requirement 
is the same as if sixty-four lines of eighty characters each had been entered. 
Because of this, the editor will not be able to operate on source programs 
which exceed approximately one thousand, four hundred lines in length. The 
typical disk would not be able to hold the intermediate files of larger source 
programs. 

A much greater effort should have been directed to the filing system early 
in the design phase. The current file system is so simple that it degrades the 
performance of the editor as a whole. Also, the space required by the utilities 
needed to implement paging, disk random access, merging of files, and other 
similar functions consume a large part of the space required by the editor. 
Approximately eight thousand bytes of compiled code is dedicated to just these 
utilities, 

A second design error concerns the buffers internal to the editor. There are 
three such buffers. One buffer holds the characters displayed on the screen. The 
second holds one page of the user’s program read in from the disk. The last 
buffer is dedicated to insertions which the user may include. These three 
buffers are physically separated. A better design decision would have been to 
combine the last two buffers into one, logically separated structure. 

The screen buffer must keep track of the physical location of each line 
being displayed. If any displayed line is edited, its location in memory must 
be known so that an update can occur. If only one buffer had been used (in 
addition to the screen buffer) simply storing the array index of each line on 
screen would have satisfied this requirement. Currently, the array name and the 


array index must be saved for each displayed line. Then separate sequences of 
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code are required to access and update the screen lines, dependent on which 
array holds the line. The code sequences are very similar; the primary differ— 
ence being the array name. Much of the "duplicate" code could have been elimi— 
nated if only one buffer had been used. 

A third design error concerned the choice of the supported language. Many 
other languages have a much more regular structure and have readily available 
grammars. Not only did the C language have to be learned, but a large amount of 
time was devoted to developing a grammar for the language. The error here is 
that one should have a very thorough knowledge of a language prior to attempt— 
ing the implementation of a syntax directed editor to support that language. 
Moreover, the editor itself is written in C. The amount of code used in this 
implementation could be reduced, simply because different, shorter code 


sequences could be used to accomplish the same result in many instances. 


C. SUMMARY 

A syntax directed editor, although a very powerful and convenient tool, 
can be implemented on a very basic machine. However, the implementation of the 
tool should not be taken lightly. Early design decisions which seem _ peripheral 
to the effort can later become the major limiting factors of the system if not 
sufficiently considered. This fact evinces the importance of the design phase. 
As a general rule, a less than complete design will produce a less than desired 
output. 

Time constraints on the completion of a project can cause some features to 
be excluded. An implementation can always be done better in more time. The 


editor created as a part of this thesis is a useful tool and it can be 
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enjoyable to observe in operation. The system could be much better and could 
include more facilities had a sufficient amount of time been devoted to its 


implementation. 
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APPENDIX A. C LANGUAGE SYNTAX CHARTS 

The following charts have been developed primarily from information found 
in Appendix A to Reference 12. The works of Michael Meissner [Ref. 14] and 
Patrick Fitzhorn [Ref. 15] have also been consulted. 

The syntax charts do not represent an absolute definition of the C 
language. They have been constructed so as to be as exception free as possible. 
However, this goal was not always met and the information contained in 
Appendix A to Reference 12 should be considered as the absolute authority in 
the event of discrepancies. Certain obvious classes have been omitted from the 
charts. The class Letter consists of all upper and lower case letters, a thru 
z. The class Symbol contains any character symbol not represented in the Letter 
class. ‘Character symbol’ is meant to exclude digits and the character 
representation of digits. The sta/ze:zead words in the charts indicate syntactic 
classes (nonterminals}, Letters and words in normal type are meant to indicate 


verbatim representations (terminals). 
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